Distributed Community Detection on Edge-labeled Graphs using Spark
Abstract
How can we detect communities in graphs with edge labels, such as time-evolving networks or edge-colored graphs? Unlike classical graphs, edge-labeled graphs carry additional information about the type of each edge, e.g., when two people became connected, or which company operates the air route between two cities. We model community detection on edge-labeled graphs as a tensor decomposition problem and propose TeraCom, a distributed system that scales to solve this problem on 10x larger graphs. By carefully designing our algorithm and leveraging the Spark framework, we achieve better accuracy in recovering ground-truth communities than PARAFAC methods, with up to a 30% increase in NMI. We also present interesting clusters discovered by our system in a flights network.
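To make the modeling step concrete, the following is a minimal sketch, not TeraCom's implementation, of how an edge-labeled graph can be stored as a sparse node x node x label tensor, and how NMI between ground-truth and detected communities can be computed. The function names `build_tensor` and `nmi`, the toy airline data, the undirected-edge convention, and the geometric-mean normalization of NMI are all assumptions made for illustration; the abstract does not specify any of them.

```python
from collections import Counter
from math import log, sqrt


def build_tensor(edges):
    """Map labeled edges (u, v, label) to a sparse 3-way tensor.

    Returns (entries, nodes, labels). entries[(i, j, k)] == 1.0 means
    nodes[i] and nodes[j] are joined by an edge carrying labels[k].
    Edges are treated as undirected (an assumption of this sketch).
    """
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})
    labels = sorted({lab for _, _, lab in edges})
    node_id = {n: i for i, n in enumerate(nodes)}
    label_id = {lab: k for k, lab in enumerate(labels)}
    entries = {}
    for u, v, lab in edges:
        i, j, k = node_id[u], node_id[v], label_id[lab]
        entries[(i, j, k)] = 1.0
        entries[(j, i, k)] = 1.0  # symmetric entry for the undirected edge
    return entries, nodes, labels


def nmi(truth, pred):
    """Normalized mutual information between two community assignments.

    Uses the geometric mean of the two entropies as the normalizer;
    other normalizations exist and give different values.
    """
    n = len(truth)
    ct, cp = Counter(truth), Counter(pred)
    joint = Counter(zip(truth, pred))
    mi = sum(c / n * log(c * n / (ct[t] * cp[p])) for (t, p), c in joint.items())
    ht = -sum(c / n * log(c / n) for c in ct.values())
    hp = -sum(c / n * log(c / n) for c in cp.values())
    if ht == 0 or hp == 0:
        # Degenerate case: at least one side has a single community.
        return 1.0 if ht == hp else 0.0
    return mi / sqrt(ht * hp)


# Hypothetical toy flights network: cities as nodes, airlines as edge labels.
flights = [("AMS", "CDG", "AirX"), ("CDG", "FCO", "AirY"), ("AMS", "FCO", "AirX")]
entries, nodes, labels = build_tensor(flights)
```

A decomposition method would then factor this tensor so that each component groups nodes and labels that co-occur; the sketch stops at the representation and the evaluation metric.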
Similar resources
Spectral Clustering and Community Detection in Labeled Graphs
We study spectral clustering techniques to learn community structures in labeled random graphs where edge labels from a label set L = {1, ..., L} are drawn according to discrete probability distributions parametrized by community membership of the two end-nodes of the edge. This is a strict generalization of the standard stochastic block model for community detection.
MR-ECOCD: An Edge Clustering Algorithm for Overlapping Community Detection on Large-scale Network Using MapReduce
Overlapping community detection is progressively becoming an important issue in complex networks. Many in-memory overlapping community detection algorithms have been proposed for graphs with thousands of nodes. However, analyzing massive graphs with millions of nodes is impossible with traditional algorithms. In this paper, we propose MR-ECOCD, a novel distributed computation algorithm using ...
A note on 3-Prime cordial graphs
Let G be a (p, q) graph. Let f : V(G) → {1, 2, ..., k} be a map. For each edge uv, assign the label gcd(f(u), f(v)). f is called a k-prime cordial labeling of G if |v_f(i) − v_f(j)| ≤ 1 for i, j ∈ {1, 2, ..., k} and |e_f(0) − e_f(1)| ≤ 1, where v_f(x) denotes the number of vertices labeled with x, and e_f(1) and e_f(0) respectively denote the number of edges labeled with 1 and not labeled with 1....
4-Prime cordiality of some classes of graphs
Let G be a (p, q) graph. Let f : V(G) → {1, 2, ..., k} be a map. For each edge uv, assign the label gcd(f(u), f(v)). f is called a k-prime cordial labeling of G if |v_f(i) − v_f(j)| ≤ 1 for i, j ∈ {1, 2, ..., k} and |e_f(0) − e_f(1)| ≤ 1, where v_f(x) denotes the number of vertices labeled with x, and e_f(1) and e_f(0) respectively denote the number of edges labeled with 1 and not labeled with 1....
DFEP: Distributed Funding-Based Edge Partitioning
As graphs become bigger, the need to efficiently partition them becomes more pressing. Most graph partitioning algorithms subdivide the vertex set into partitions of similar size, trying to keep the number of cut edges as small as possible. An alternative approach divides the edge set, with the goal of obtaining more balanced partitions in the presence of high-degree nodes, such as hubs in real wor...
Publication date: 2016